Asymptotic properties of constrained Markov Decision Processes
Author
Abstract
In this paper we present several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost criteria, with unbounded costs. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) the convergence of MDPs with a truncated state space to the problem with an infinite state space, and (3) the convergence of MDPs as the discount factor goes to a limit. In all these cases we establish the convergence of optimal values and policies. Moreover, based on the optimal policy for the limiting problem, we construct policies which are almost optimal for the other (approximating) problems.
Similar resources
Constrained Markovian dynamics of random graphs
We introduce a statistical mechanics formalism for the study of constrained graph evolution as a Markovian stochastic process, in analogy with that available for spin systems, deriving its basic properties and highlighting the role of the ‘mobility’ (the number of allowed moves for any given graph). As an application of the general theory we analyze the properties of degree-preserving Markov ch...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Metrics for Labeled Markov Systems
Partial Labeled Markov Chains are simultaneously generalizations of process algebra and of traditional Markov chains. They provide a foundation for interacting discrete probabilistic systems, the interaction being synchronization on labels as in process algebra. Existing notions of process equivalence are too sensitive to the exact probabilities of various transitions. This paper addresses cont...
Algebraic System Analysis of Timed Petri Nets
We show that Continuous Timed Petri Nets (CTPN) can be modeled by generalized polynomial recurrent equations in the (min,+) semiring. We establish a correspondence between CTPN and Markov decision processes. We survey the basic system theoretical results available: behavioral (input-output) properties, algebraic representations, asymptotic regime. A particular attention is paid to the subclass o...
Journal title:
- ZOR - Meth. & Mod. of OR
Volume 37, Issue
Pages -
Publication date 1993